Parallel Processing using Distributed Ruby

The Ruby programming language provides simple distributed object support with the Distributed Ruby (DRb) module. Methods can be invoked on objects running in other processes on the same or different machines. Ruby also provides thread support, though threads are application-level and will not utilize multiple processors. Simple experiments were performed to determine if these facilities could be used to speed up a text manipulation application via parallel processing on multiple machines.

The sample program is here.

A supervisor program is run on one machine, and workers are run on multiple ports on multiple machines. The supervisor first reads filenames from the source directory into a queue. It creates one thread per worker. Each thread repeatedly reads the next filename from the queue, reads the file contents, sends contents via blocking RPC using DRb to the worker, and writes the response to a new file.

The test case was simple string substitutions against a set of 1007 files. The files were obtained from the Linux Kernel release 2.6.16.18; contents of directories (excluding subdirectories): kernel, include/linux, lib, mm, scripts, Documentation. The test cluster is the same as that described in Genezzo Cluster Hardware. The 3 machines are of roughly comparable performance, with bogomips ratings of 1278 (ratchet), 1148 (gigabyte), and 1090 (pavilion). The supervisor is run on host gigabyte.

The following are the run times for various combinations of worker processes:

gigabyte (localhost)ratchetpavilion time in seconds
0 (no DRb)0011.2
10012.7
1116.3
20015.3
10111.1
11010.7
0116.7
0216.2
0226.0
01011.9
0335.9 (best)
0446.3
1228.0

So the best time was obtained with no local worker process and three remote worker processes per machine. This was true when the text substitions were repeated (looped) 30 times inside each call. Without this looping it was more efficient to perform all the work in-process without the overhead of communication with workers:

gigabyte (localhost)ratchetpavilion time in seconds
0 (no DRb)001.2 (best)
1002.7
0224.1

Note the 30.times versus 1.times loops are being used to simulate different computational loadings (relative to communications expense) in the workers. It is not a tuning parameter. Other parameters, such as workers per machine and whether to run workers on the supervisor host are dependent on the specific application, machine, and network types. Multiprocessor/multicore machines will certainly support more workers per machine, and DRb should be a good way to utilize these machines since Ruby doesn't support OS-level threads.

In a real deployment of this type of system another option would be to use the NFS file system to distribute the file contents, instead of DRb. DRb would only be used to communicate the file names. NFS is probably better suited to communicate large blocks of data across the network.